1 TGCTGGGGCA CCTGAAGGAG ACTTGGGGGC ACCCGCGTCG TGCCTCCTGG 
51 GTTGTGAGGA GTCGCCGCTG CCGCCACTGC CTGTGCTTCA TGAGGAAGAT 
101 GCTCGCCGCC GTCTCCCGCG TGCTGTCTGG CGCTTCTCAG AAGCCGGCAA 
151 GCAGAGTGCT GGTAGCATCC CGTAATTTTG CAAATGATGC TACATTTGAA 
201 ATTAAGAAAT GTGACCTTCA CCGGCTGGAA GAAGGCCCTC CTGTCACAAC 
251 AGTGCTCACC AGGGAGGATG GGCTCAAATA CTACAGGATG ATGCAGACTG 
301 TACGCCGAAT GGAGTTGAAA GCAGATCAGC TGTATAAACA GAAAATTATT 
351 CGTGGTTTCT GTCACTTGTG TGATGGTCAG TTTCTCCTTC CTCTAACACA 
401 GGAAGCTTGC TGTGTGGGCC TGGAGGCCGG CATCAACCCC ACAGACCATC 
451 TCATCACAGC CTACCGGGCT CACGGCTTTA CTTTCACCCG GGGCCTTTCC 
501 GTCCGAGAAA TTCTCGCAGA GCTTACAGGA CGAAAAGGAG GTTGTGCTAA 
551 AGCGAAAGGA GGATCGATGC ACATGTATGC CAAGAACTTC TACGGGGGCA 
601 ATGGCATCGT GGGAGCGCAG GTGCCCCTGG GCGCTGGGAT TGCTCTAGCC 
651 TGTAAGTATA ATGGAAAAGA TGAGGTCTGC CTGACTTTAT ATGGCGATGG 
701 TGCTGCTAAC CAGGGCCAGA TATTCGAAGC TTACAACATG GCAGCTTTGT 
751 GGAAATTACC TTGTATTTTC ATCTGTGAGA ATAATCGCTA TGGAATGGGA 
801 ACGTCTGTTG AGAGAGCGGC AGCCAGCACT GATTACTACA AGAGAGGCGA 
851 TTTCATTCCT GGGCTGAGAG TGGATGGAAT GGATATCCTG TGCGTCCGAG 
901 AGGCAACAAG GTTTGCTGCT GCCTATTGTA GATCTGGGAA GGGGCCCATC 
951 CTGATGGAGC TGCAGACTTA CCGTTACCAC GGACACAGTA TGAGTGACCC 
1001 TGGAGTCAGT TACCGTACAC GAGAAGAAAT TCAGGAAGTA AGAAGTAAGA 
1051 GTGACCCTAT TATGCTTCTC AAGGACAGGA TGGTGAACAG CAATCTTGCC 
1101 AGTGTGGAAG AACTAAAGGA AATTGATGTG GAAGTGAGGA AGGAGATTGA 
1151 GGATGCTGCC CAGTTTGCCA CGGCCGATCC TGAGCCACCT TTGGAAGAGC 
1201 TGGGCTACCA CATCTACTCC AGCGACCCAC CTTTTGAAGT TCGTGGTGCC 
1251 AATCAGTGGA TCAAGTTTAA GTCAGTCAGT TAAGGGGAGG AGAAGGAGAG 
1301 GTTATACCTT CAGGGGGCTA CCAGACAGTG TTCTCAACTT GGTTAAGGAG 
1351 GAAGAAAACC CAGTCAATGA AATTCAATGA AATTCTTGGA AACTTCCATT 
1401 AAGTGTGTAG ATTGAGCAGG TAGTAATTGC ATGCAGTTTG TACATTAGTG 
1451 CATTAAAAGA TGAATTATTG AGTGCTTAAA AAAAAAAAAA AAAAAAAAAA 
1501 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAA (SEQ ID NO:1) 

FEATURES:

5'UTR: 1-89

Start Codon: 90

Stop Codon: 1281

3'UTR: 1284 
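
As a consistency check on the coordinates above, the following minimal sketch (not part of the original listing) computes the coding-region length implied by a start codon at 90 and a stop codon at 1281, and translates the first ten codons copied from SEQ ID NO:1. The function name `translate` and the compact codon table are illustrative assumptions; the table encodes the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
# BASES, AMINO and CODON_TABLE are illustrative names, not from the listing.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons of `dna`, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


START, STOP = 90, 1281      # 1-based positions from the FEATURES block
coding_nt = STOP - START    # nucleotides 90..1280 inclusive
codons = coding_nt // 3

# Nucleotides 90-119 of SEQ ID NO:1, copied from the listing above.
first_30_nt = "ATGAGGAAGATGCTCGCCGCCGTCTCCCGC"

print(coding_nt, codons, translate(first_30_nt))
```

The coding region spans 1191 nucleotides, that is 397 codons, which agrees with the 397-residue protein listed as SEQ ID NO:2.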



Homologous proteins: 

Top 10 BLAST Hits 



CRA|18000004925454 /altid=gi|387011 /def=gb|AAA60055.1| (J03503.
CRA|18000004920128 /altid=gi|4505685 /def=ref|NP_000275.1| pyru.
CRA|18000004938217 /altid=gi|6679261 /def=ref|NP_032836.1| pyru.
CRA|18000004939896 /altid=gi|66035 /def=pir||DERTP1 pyruvate de.
CRA|18000004949905 /altid=gi|129064 /def=sp|P26284|ODPA_RAT PYR.
CRA|18000004885327 /altid=gi|266686 /def=sp|P29804|ODPA_PIG PYR.
CRA|18000004969398 /altid=gi|448580 /def=prf||1917268A pyruvate.
CRA|18000005012775 /altid=gi|1079460 /def=pir||A49360 pyruvate.
CRA|18000004884262 /altid=gi|1709452 /def=sp|P52900|ODPA_SMIMA.
CRA|18000004925713 /altid=gi|4885543 /def=ref|NP_005381.1| pyru.
BLAST hits to dbEST:

gi|10991237 /dataset=dbest /taxon=96..
gi|14051054 /dataset=dbest /taxon=960.
gi|14076211 /dataset=dbest /taxon=960.
gi|11251518 /dataset=dbest /taxon=96..
gi|13914836 /dataset=dbest /taxon=960.
gi|2539160 /dataset=dbest /taxon=9606
gi|3214685 /dataset=dbest /taxon=9606
gi|5933458 /dataset=dbest /taxon=9606
gi|4988948 /dataset=dbest /taxon=9606
gi|4900594 /dataset=dbest /taxon=9606
gi|4534604 /dataset=dbest /taxon=9606
gi|7455087 /dataset=dbest /taxon=9606.
EXPRESSION INFORMATION FOR MODULATORY USE: 

library source: 

Expression information from BLAST dbEST hits: 

gi|10991237 Neuronal precursor cells - teratocarcinoma
gi|14051054 skin
gi|14076211 skin melanotic melanoma, high MDR (cell line)
gi|11251518 muscle rhabdomyosarcoma
gi|13914836 brain neuroblastoma, cell line
gi|2539160 whole brain
gi|3214685 breast
gi|5933458 stomach
gi|4988948 pancreas - adenocarcinoma
gi|4900594 uterus - serous papillary carcinoma, high grade
gi|4534604 brain - anaplastic oligodendroglioma
gi|7455087 colon - moderately-differentiated adenocarcinoma

Tissue source of cDNA clone: 
Fetal whole brain 
1 MRKMLAAVSR VLSGASQKPA SRVLVASRNF ANDATFEIKK CDLHRLEEGP 
51 PVTTVLTRED GLKYYRMMQT VRRMELKADQ LYKQKIIRGF CHLCDGQFLL 
101 PLTQEACCVG LEAGINPTDH LITAYRAHGF TFTRGLSVRE ILAELTGRKG 
151 GCAKAKGGSM HMYAKNFYGG NGIVGAQVPL GAGIALACKY NGKDEVCLTL 
201 YGDGAANQGQ IFEAYNMAAL WKLPCIFICE NNRYGMGTSV ERAAASTDYY 
251 KRGDFIPGLR VDGMDILCVR EATRFAAAYC RSGKGPILME LQTYRYHGHS 
301 MSDPGVSYRT REEIQEVRSK SDPIMLLKDR MVNSNLASVE ELKEIDVEVR 
351 KEIEDAAQFA TADPEPPLEE LGYHIYSSDP PFEVRGANQW IKFKSVS (SEQ ID NO: 2) 

FEATURES:

Functional domains and key regions: 

[1] PDOC00005 PS00005 PKC_PHOSPHO_SITE 
Protein kinase C phosphorylation site 

Number of matches: 7 

1 16-18 SQK 

2 70-72 TVR 

3 137-139 SVR 

4 146-148 TGR 

5 282-284 SGK 

6 293-295 TYR 

7 307-309 SYR 

[2] PDOC00006 PS00006 CK2_PHOSPHO_SITE 
Casein kinase II phosphorylation site 

Number of matches: 7 

1 57-60 TRED 

2 137-140 SVRE 

3 238-241 TSVE 

4 300-303 SMSD 

5 310-313 TREE 

6 319-322 SKSD 

7 338-341 SVEE 

[3] PDOC00008 PS00008 MYRISTYL 
N-myristoylation site 

Number of matches: 7 

1 110-115 GLEAGI 

2 114-119 GINPTD 

3 151-156 GCAKAK 

4 172-177 GIVGAQ 

5 181-186 GAGIAL 

6 183-188 GIALAC 

7 235-240 GMGTSV 

[4] PDOC00009 PS00009 AMIDATION 
Amidation site 

146-149 TGRK 

[5] PDOC00016 PS00016 RGD 
Cell attachment sequence 

252-254 RGD 
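
The site lists above can be cross-checked against SEQ ID NO:2 with a short pattern scan. The sketch below is an illustration rather than the tool that produced the figure: it assumes the usual PROSITE consensus patterns for these entries (`[ST]-x-[RK]` for PS00005, `[ST]-x(2)-[DE]` for PS00006, `R-G-D` for PS00016) rewritten as regular expressions, and the names `PROTEIN`, `PATTERNS` and `scan` are hypothetical.

```python
import re

# SEQ ID NO:2 as printed in the listing (397 residues).
PROTEIN = (
    "MRKMLAAVSRVLSGASQKPASRVLVASRNFANDATFEIKKCDLHRLEEGP"
    "PVTTVLTREDGLKYYRMMQTVRRMELKADQLYKQKIIRGFCHLCDGQFLL"
    "PLTQEACCVGLEAGINPTDHLITAYRAHGFTFTRGLSVREILAELTGRKG"
    "GCAKAKGGSMHMYAKNFYGGNGIVGAQVPLGAGIALACKYNGKDEVCLTL"
    "YGDGAANQGQIFEAYNMAALWKLPCIFICENNRYGMGTSVERAAASTDYY"
    "KRGDFIPGLRVDGMDILCVREATRFAAAYCRSGKGPILMELQTYRYHGHS"
    "MSDPGVSYRTREEIQEVRSKSDPIMLLKDRMVNSNLASVEELKEIDVEVR"
    "KEIEDAAQFATADPEPPLEELGYHIYSSDPPFEVRGANQWIKFKSVS"
)

# Assumed regular-expression forms of the PROSITE consensus patterns.
PATTERNS = {
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",
    "CK2_PHOSPHO_SITE": r"[ST]..[DE]",
    "RGD": r"RGD",
}


def scan(seq, pattern):
    """Return (start, end, match) tuples with 1-based inclusive coordinates.

    A lookahead is used so that overlapping sites are all reported.
    """
    hits = []
    for m in re.finditer(f"(?=({pattern}))", seq):
        site = m.group(1)
        hits.append((m.start() + 1, m.start() + len(site), site))
    return hits


for name, pattern in PATTERNS.items():
    for start, end, site in scan(PROTEIN, pattern):
        print(f"{name}\t{start}-{end}\t{site}")
```

The lookahead keeps overlapping candidates, which matters for motif families such as the N-myristoylation sites, where listed matches overlap (for example 181-186 and 183-188).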



Membrane spanning structure and domains: 

Helix Begin End Score Certainty 
1 169 189 1.097 Certain 



FIGURE 2A 



BLAST Alignment to Top Hit: 

>CRA|18000004925454 /altid=gi|387011 /def=gb|AAA60055.1| (J03503)
pyruvate dehydrogenase E1-alpha precursor [Homo sapiens]
/org=Homo sapiens /taxon=9606 /dataset=nraa /length=414
Length = 414

Score = 846 bits (2163), Expect = 0.0
Identities = 411/421 (97%), Positives = 411/421 (97%)
Frame = +3

Query: 18   ETWGHPRRASWVVRSRRCRHCLCFMRKMLAAVSRVLSGASQKPASRVLVASRNFANDATF 197
            ETWGHPRRASWVVRSRRCRHCLCFMRKMLAAVSRVLSGASQKPASRVLVASRNFANDATF
Sbjct: 1    ETWGHPRRASWVVRSRRCRHCLCFMRKMLAAVSRVLSGASQKPASRVLVASRNFANDATF 60

Query: 198  EIKKCDLHRLEEGPPVTTVLTREDGLKYYRMMQTVRRMELKADQLYKQKIIRGFCHLCDG 377
            EIKKCDLHRLEEGPPVTTVLTREDGLKYYRMMQTVRRMELKADQLYKQKIIRGFCHLCDG
Sbjct: 61   EIKKCDLHRLEEGPPVTTVLTREDGLKYYRMMQTVRRMELKADQLYKQKIIRGFCHLCDG 120

Query: 378  QFLLPLTQEACCVGLEAGINPTDHLITAYRAHGFTFTRGLSVREILAELTGRKGGCAKAK 557
            Q       EACCVGLEAGINPTDHLITAYRAHGFTFTRGLSVREILAELTGRKGGCAK K
Sbjct: 121  Q-------EACCVGLEAGINPTDHLITAYRAHGFTFTRGLSVREILAELTGRKGGCAKGK 173

Query: 558  GGSMHMYAKNFYGGNGIVGAQVPLGAGIALACKYNGKDEVCLTLYGDGAANQGQIFEAYN 737
            GGSMHMYAKNFYGGNGIVGAQVPLGAGIALACKYNGKDEVCLTLYGDGAANQGQIFEAYN
Sbjct: 174  GGSMHMYAKNFYGGNGIVGAQVPLGAGIALACKYNGKDEVCLTLYGDGAANQGQIFEAYN 233

Query: 738  MAALWKLPCIFICENNRYGMGTSVERAAASTDYYKRGDFIPGLRVDGMDILCVREATRFA 917
            MAALWKLPCIFICENNRYGMGTSVERAAASTDYYKRGDFIPGLRVDGMDILCVREATRFA
Sbjct: 234  MAALWKLPCIFICENNRYGMGTSVERAAASTDYYKRGDFIPGLRVDGMDILCVREATRFA 293

Query: 918  AAYCRSGKGPILMELQTYRYHGHSMSDPGVSYRTREEIQEVRSKSDPIMLLKDRMVNSNL 1097
            AAYCRSGKGPILMELQTYRYHGHSMSDPGVSYRTREEIQEVRSKSDPIMLLKDRMVNSNL
Sbjct: 294  AAYCRSGKGPILMELQTYRYHGHSMSDPGVSYRTREEIQEVRSKSDPIMLLKDRMVNSNL 353

Query: 1098 ASVEELKEIDVEVRKEIEDAAQFATADPEPPLEELGYHIYSSDPPFEVRGANQWIKFKSV 1277
            ASVEELKEIDVEVRKEIED AQFA ADPEPPLEELGYHIYSSDPPFEVRGANQWIKFKSV
Sbjct: 354  ASVEELKEIDVEVRKEIEDPAQFAAADPEPPLEELGYHIYSSDPPFEVRGANQWIKFKSV 413

Query: 1278 S 1280
            S
Sbjct: 414  S 414 (SEQ ID NO:4)



>CRA|18000004920128 /altid=gi|4505685 /def=ref|NP_000275.1| pyruvate
dehydrogenase (lipoamide) alpha 1; Pyruvate
dehydrogenase, E1-alpha polypeptide-1 [Homo sapiens]
/org=Homo sapiens /taxon=9606 /dataset=nraa /length=390
Length = 390

Score = 793 bits (2025), Expect = 0.0
Identities = 389/397 (97%), Positives = 389/397 (97%)
Frame = +3

Query: 90   MRKMLAAVSRVLSGASQKPASRVLVASRNFANDATFEIKKCDLHRLEEGPPVTTVLTRED 269
            MRKMLAAVSRVLSGASQKPASRVLVASRNFANDATFEIKKCDLHRLEEGPPVTTVLTRED
Sbjct: 1    MRKMLAAVSRVLSGASQKPASRVLVASRNFANDATFEIKKCDLHRLEEGPPVTTVLTRED 60

Query: 270  GLKYYRMMQTVRRMELKADQLYKQKIIRGFCHLCDGQFLLPLTQEACCVGLEAGINPTDH 449
            GLKYYRMMQTVRRMELKADQLYKQKIIRGFCHLCDGQ       EACCVGLEAGINPTDH
Sbjct: 61   GLKYYRMMQTVRRMELKADQLYKQKIIRGFCHLCDGQ-------EACCVGLEAGINPTDH 113

Query: 450  LITAYRAHGFTFTRGLSVREILAELTGRKGGCAKAKGGSMHMYAKNFYGGNGIVGAQVPL 629
            LITAYRAHGFTFTRGLSVREILAELTGRKGGCAK KGGSMHMYAKNFYGGNGIVGAQVPL
Sbjct: 114  LITAYRAHGFTFTRGLSVREILAELTGRKGGCAKGKGGSMHMYAKNFYGGNGIVGAQVPL 173

Query: 630  GAGIALACKYNGKDEVCLTLYGDGAANQGQIFEAYNMAALWKLPCIFICENNRYGMGTSV 809
            GAGIALACKYNGKDEVCLTLYGDGAANQGQIFEAYNMAALWKLPCIFICENNRYGMGTSV
Sbjct: 174  GAGIALACKYNGKDEVCLTLYGDGAANQGQIFEAYNMAALWKLPCIFICENNRYGMGTSV 233

Query: 810  ERAAASTDYYKRGDFIPGLRVDGMDILCVREATRFAAAYCRSGKGPILMELQTYRYHGHS 989
            ERAAASTDYYKRGDFIPGLRVDGMDILCVREATRFAAAYCRSGKGPILMELQTYRYHGHS
Sbjct: 234  ERAAASTDYYKRGDFIPGLRVDGMDILCVREATRFAAAYCRSGKGPILMELQTYRYHGHS 293

Query: 990  MSDPGVSYRTREEIQEVRSKSDPIMLLKDRMVNSNLASVEELKEIDVEVRKEIEDAAQFA 1169
            MSDPGVSYRTREEIQEVRSKSDPIMLLKDRMVNSNLASVEELKEIDVEVRKEIEDAAQFA
Sbjct: 294  MSDPGVSYRTREEIQEVRSKSDPIMLLKDRMVNSNLASVEELKEIDVEVRKEIEDAAQFA 353

Query: 1170 TADPEPPLEELGYHIYSSDPPFEVRGANQWIKFKSVS 1280
            TADPEPPLEELGYHIYSSDPPFEVRGANQWIKFKSVS
Sbjct: 354  TADPEPPLEELGYHIYSSDPPFEVRGANQWIKFKSVS 390 (SEQ ID NO:5)



>CRA|18000004938217 /altid=gi|6679261 /def=ref|NP_032836.1| pyruvate
dehydrogenase E1alpha subunit [Mus musculus] /org=Mus
musculus /taxon=10090 /dataset=nraa /length=390
Length = 390

Score = 783 bits (1999), Expect = 0.0
Identities = 382/397 (96%), Positives = 387/397 (97%)
Frame = +3

Query: 90   MRKMLAAVSRVLSGASQKPASRVLVASRNFANDATFEIKKCDLHRLEEGPPVTTVLTRED 269
            MRKMLAAVSRVL+G++QKPASRVLVASRNFANDATFEIKKCDLHRLEEGPPVTTVLTRED
Sbjct: 1    MRKMLAAVSRVLAGSAQKPASRVLVASRNFANDATFEIKKCDLHRLEEGPPVTTVLTRED 60

Query: 270  GLKYYRMMQTVRRMELKADQLYKQKIIRGFCHLCDGQFLLPLTQEACCVGLEAGINPTDH 449
            GLKYYRMMQTVRRMELKADQLYKQKIIRGFCHLCDGQ       EACCVGLEAGINPTDH
Sbjct: 61   GLKYYRMMQTVRRMELKADQLYKQKIIRGFCHLCDGQ-------EACCVGLEAGINPTDH 113

Query: 450  LITAYRAHGFTFTRGLSVREILAELTGRKGGCAKAKGGSMHMYAKNFYGGNGIVGAQVPL 629
            LITAYRAHGFTFTRGL VR ILAELTGR+GGCAK KGGSMHMYAKNFYGGNGIVGAQVPL
Sbjct: 114  LITAYRAHGFTFTRGLPVRAILAELTGRRGGCAKGKGGSMHMYAKNFYGGNGIVGAQVPL 173

Query: 630  GAGIALACKYNGKDEVCLTLYGDGAANQGQIFEAYNMAALWKLPCIFICENNRYGMGTSV 809
            GAGIALACKYNGKDEVCLTLYGDGAANQGQIFEAYNMAALWKLPCIFICENNRYGMGTSV
Sbjct: 174  GAGIALACKYNGKDEVCLTLYGDGAANQGQIFEAYNMAALWKLPCIFICENNRYGMGTSV 233

Query: 810  ERAAASTDYYKRGDFIPGLRVDGMDILCVREATRFAAAYCRSGKGPILMELQTYRYHGHS 989
            ERAAASTDYYKRGDFIPGLRVDGMDILCVREAT+FAAAYCRSGKGPILMELQTYRYHGHS
Sbjct: 234  ERAAASTDYYKRGDFIPGLRVDGMDILCVREATKFAAAYCRSGKGPILMELQTYRYHGHS 293

Query: 990  MSDPGVSYRTREEIQEVRSKSDPIMLLKDRMVNSNLASVEELKEIDVEVRKEIEDAAQFA 1169
            MSDPGVSYRTREEIQEVRSKSDPIMLLKDRMVNSNLASVEELKEIDVEVRKEIEDAAQFA
Sbjct: 294  MSDPGVSYRTREEIQEVRSKSDPIMLLKDRMVNSNLASVEELKEIDVEVRKEIEDAAQFA 353

Query: 1170 TADPEPPLEELGYHIYSSDPPFEVRGANQWIKFKSVS 1280
            TADPEPPLEELGYHIYSSDPPFEVRGANQWIKFKSVS
Sbjct: 354  TADPEPPLEELGYHIYSSDPPFEVRGANQWIKFKSVS 390 (SEQ ID NO:6)
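
The Identities figures in the alignments above can be recomputed from the aligned strings. The following sketch is illustrative only: the helper names `alignment_stats` and `percent` are hypothetical, gap columns are assumed to be written as `-` (the usual BLAST convention), and the example block is the Query 378-557 / Sbjct 121-173 segment of the top hit.

```python
def alignment_stats(query: str, sbjct: str):
    """Return (identities, gap_columns, aligned_length) for two aligned strings."""
    if len(query) != len(sbjct):
        raise ValueError("aligned strings must have equal length")
    identities = sum(q == s and q != "-" for q, s in zip(query, sbjct))
    gaps = sum(q == "-" or s == "-" for q, s in zip(query, sbjct))
    return identities, gaps, len(query)


def percent(n: int, total: int) -> int:
    """Integer percentage, truncated rather than rounded."""
    return n * 100 // total


# One 60-column block of the alignment to the top hit (gaps shown as '-').
query_block = "QFLLPLTQEACCVGLEAGINPTDHLITAYRAHGFTFTRGLSVREILAELTGRKGGCAKAK"
sbjct_block = "Q-------EACCVGLEAGINPTDHLITAYRAHGFTFTRGLSVREILAELTGRKGGCAKGK"

print(alignment_stats(query_block, sbjct_block))
print(percent(411, 421), percent(389, 397), percent(382, 397), percent(387, 397))
```

Truncating the percentage reproduces the values printed in the three alignment headers (97%, 97%, 96% and 97%), whereas rounding would give 98% for 411/421 and 389/397.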
Hmmer search results (Pfam):

Model    Description                 E-value  N
PF00676  Dehydrogenase E1 component  4e-176   1
PF01579  Domain of unknown function  2.3      1

Parsed for domains:

Model    Domain  seq-f  seq-t  hmm-f  hmm-t      score  E-value
PF01579  1/1        28     46    153    173 .]     3.0  2.3
PF00676  1/1        66    369      1    327 []   598.5  4e-176
1 AGTTGTTCCT TCTAACCCAT TGATTTGTTC AATCATGTAT TTAAGTAGGA 
51 CCTATATTTT ACTTGTTCCT TGCTATATCT TCAGTGTGTA GTACAGTGTC 
101 TGACACAAAA TCGGTGCTCA ATAATAGGTG TTGGATGAAT GAGCAAATGA 
151 ATGAATGAAT TCATATTCAT ATGGCCTACA GAGTTCCCGT ACATGCACAA 
201 CCAATATCAC CACCCCGTGG AGATGACTCC CAAATTAATA TTTTTAGCAA 
251 ATGTTCCAGA CTTACAACTC CAACTTCCCG GGGGACATCT TCAGATAGCT 
301 GTGCCACTGC CACCACCAGG TCAACATGTC CCAAACCATT CAGACCAGCT 
351 TTTTCTCCTG AGCTGGACAT CTGGCCTCCA ACCTTTTCAT TCTCTTTTAC 
401 CTTTCATATT CTATCAGCAG CAGCAGCTGC TGAAATCATA CCATGCAAGT 
451 TTCTCACGTC CATCTCTGCC TTTTAATGGC GCCCTCTCAC TCCTTTAAGA 
501 AGTTTTCTTC CACTGCAACA CGATCTCTCA GTCCAGAGTC TGGCCCAGTG 
551 CCCAAATTAT TTCTCTAGCT ATGCTGAGAG CTGGTCATGC TTTGAACTTC 
601 TGCTTTGAAT ACTTTCAGTG ACACTGGGAG AGAATTATCT CATTGGACCA 
651 TTGTCATTGT TAGAAAATTC ATTGTTATGC TGAAATGAAA TGATTTTATT 
701 CACACACACA CACACACACA CACAAAATAG CTCTTCCTCC TGGAACATGA 
751 CTGGCCTGAA AATGTGTGAA GACATATCCA ATCCTCTCTG GTTTTACTGT 
801 TCATCCAATT TTCTGTTCTC CTCCTGGCAG GAGGATTATA TTTCACCTTG 
851 TGGAACTCAG ACATGGTCGG GTAACTAGCT CTGGTCCGTG AAAATTGAGA 
901 GGAAGTGACA TGTGTCACTT CTGGGCAGAA GCTTTGAGAG CCGGTTTAAA 
951 TGATCCCTTT TCTCTTCATC CATGAGACAA GCTAAGTTCC AGAGAGAGGG 
1001 TGCCACGCTG TGAGGGACCT GTGTTACGAG TACGATGGCT CGCGTCACTT 
1051 CAAATTCTTG AAATCACTGA AATTTGGAGG TCAGTTGTTA CATCATAACC 
1101 CAGCCAATTC TAGTTAGCCT GTTTTCTTCC TAACTTCTTT AATCGTTCTT 
1151 CATAAGTCAC AATCGCAGCC CCTCACCGTT CTGACCACTG TCCCCTGGAT 
1201 TCCACTCAGT TTACTCATTA TCCCCCTTAA AATGTGGAGC CCAAATCTGA 
1251 ACCCGGAACC CCAGGTGCAA TCCCACTAGG ACACAACACA ATGGGTTCCT 
1301 GAGCCCTTTG ATCCTCTGAA TAGAGCCCCT TGTTGCTTTG GTGTTTTGTC 
1351 TCTGTGTGTG CTTTTATCAT CGGCTGAGCC ACGCTGTTAA CTCGCAGTGA 
1401 GCCTGTGAAC CAATAACTAG AGAAAAAAGA TTTTTCCCAT TGTCCTCTCG 
1451 ACATATATTG GGAAACAAAT TTTTTGATCC GCGTTCAAGT AGACAGGGCA 
1501 GAACTGTCCA ACTGCTACGT GATCTTTTAA AGACAAAGTT AGTGGCAGAC 
1551 CATTTACAGA AACCAGATGT TCTGTCTTTT GGCTCTGAGC ATGCTGCTAA 
1601 TCTTCATCAT CTAGTGTACT GAACGAGATG TACTGAACGA GGGCTGCAGA 
1651 GCTGCAGCAC CGGCAGGAGT AGGCGCTCGG TAGGACGGGG CCTGCACAAC 
1701 CTCCCCGGTA GTCAGCAGAG CGGAATCTAG GAAGGCTCCT TTCCCGCGGC 
1751 GCCCTGGAGG CGGGGGCCCC ACCTTCCCAC GCAGGCGCTA TCAAGCCCCG 
1801 CCTCCTCACC CGCCCGCGGC GTGGCGTCGG AAAGAGCCCT CAGCCCCTCC 
1851 CTCTCTGGCG CTGATACCCA ATGGGCAGCC TCAGGCCTTT AGCGGGGGCG 
1901 GGGCACCCCC TGGACGCCGT TCTGGTTGGC CCGCGGCCCG GCGCAGCGCA 
1951 TGACGTTATT ACGACTCTGT CACGCCGCGG TGCGACTGAG GCGTGGCGTC 
2001 TGCTGGGGCA CCTGAAGGAG ACTTGGGGGC ACCCGCGTCG TGCCTCCTGG 
2051 GTTGTGAGGA GTCGCCGCTG CCGCCACTGC CTGTGCTTCA TGAGGAAGAT 
2101 GCTCGCCGCC GTCTCCCGCG TGCTGTCTGG CGCTTCTCAG AAGCCGGTGA 
2151 GACCTCCCGG GCGGGCCGGG ATGGGGCGCG AGTGGGGCTG AGGCGGGGCC 
2201 GGAGGGCAGG GCGGGCCAGG CCGGGCCACC CAGAGCGGGG TGGAAGGCGC 
2251 CAGGGGAGCC GGGGAGCCTT TACTTCGCCT CCGCGCCCTG CATTCCGTTC 
2301 CTGGCCTCGG GAGAAGCGGC ACGGACCGGG ATCACGCCAA GGTCCGTGTG 
2351 AACTTCCCCC TTCTCGACAC CCACCTCCCG CCCCCGGGCC CAGCTGTGCG 
2401 CCAGGCGAAG TCGGTGTGCT CAAGAGGTGC CTGTTGGGTT ACAGGACACG 
2451 GAAAGGGTGG CCTCGGCCTC CTTCGAGTCT CCAATTGACC CCACTCATTT 
2501 CGGATCTTCT AACTTAATTT CTCTTGACCG AGAGGCTTTG TAATAGCGTA 
2551 GAATCTGGAG ACAGGGTGGC TTCGTTCAAA CAGCACCCTC ACCATTGACT 
2601 AGCCCTGTGA CCTTGAGCAA GTTTTTAAAC GTCCCGGGGA CCCGGTTTCC 
2651 TAAAATGTTT GCTCGAAGTG GAGTTAATCT CTAAATGGAG ATAAGAGTTA 
2701 TCTCTGAAAT GTTATCGGTT ATTAAAATGT TATCAGTTAA CTCTAAAATG 
2751 GAGATAATAA GAGTCCCCAC CTCTTGGGGT TGTCTTGAGG ATTCAACGAG 
2801 TGACACGTGT GGAAACGATT CCAAATAGCA CCTGGCACAT AATCGATAAC 
2851 ATGTGTGTTG AATAGTGTTA TTTATTGAGT CTCCAGTTCG GTATACATTT 
2901 CTTGAACACC TGTGCTCAGT TCTGAGGCGG GTTCACAGAA GGTCAGCCTC 
2951 TTCAGAAACA AACTTCCTCC TCTTCCCTCT CCCTCAACAT CTGAGCTTTT 
3001 CTTGGCAGTG AGTTCAGGAG CGCCGAAGCA GAACTCAGAG GACGCTGCCC 
3051 TCCCCTCCCC TTACCTACAC ATTCTTAGGG TACAAGTAGC TAAAGCAAAG 
3101 AGCAACGATG CTTGAGGGGT GGGGGGTAGA GTTTAGCACT ATTTCATGGC 
3151 CTCAGCATTT AGAGGTGCCT AACACCTGAG CTAGCATTCT GACCCCCCTA 
3201 GGCACAGTGA GGTCGTGTTA ATTGGTGTAA CTGCAGGCCT CGGGATTCTG 
3251 GTATTTCCCC CAGGACTTGA TACCGCTCTA CTTAGTACAG GCAAGAGATT 
3301 GTCAAAAGGT AAAGAGGTAT GCCCCTCTAG GAATCCTGTT GCCTAAAATA 
3351 ATGACAAAAC TGCCGGGTGC GGTGCTCAGG CCTGTAATCC CAGCATTTTG 
3401 GGAGGCTGAG GCAGGTGGAT CACCTGAAGG TCAGAAGTTC GAGATCAGCC 
3451 TGGCCAACAT GGTGAAACCC CGTCTCTACT AAAAATACAA AATTAGCCGG 
3501 TCGTGGTGGC GGGCTCCTGT AATCCCAGCT ACTCGGGAGG CTGAGGCGGG 
3551 AGAATAGCCT GAACCCGGGA GCGGAGTTTG CAGTGAGCGG AGATCGTGCC 
3601 ATTGCACTAC GGCCTGGGCG ACAAGAAGCA AGAACTCCGT ATTTTAAAAA 
3651 AAAAAAAAAA AAAAAAAAAA AAAAGCGTTC CCTTTAGGGA TATCTGTGGG 
3701 TAGAGGGCTG TACCGGTAGT TACGGGCTCA GAAACATCCT TCCTTTAGGC 
3751 ACCTGATGTA GGTTTTCTTC TTCTTCTGCA AGTCAGGTTC ATTGTTTCCT 
3801 GTATCAGTTT GCAGGGTCCC CCCCCCCCCG CCACCTTACA GTAGGAAGAA 
3851 AATTGAGTTC CAGATATGAA GTCACCTTTG AAAGTGCCCA GGTATCTTTC 
3901 CACTTGGTGG TGTAAACTCT TCAGATAATT AGAAGTTTTC TGTGTCACTC 
3951 AACTTGTCAT GGACTAATTT AGGAAACATT CCTGAAGCTT TTAAGGATAG 
4001 AACTAAAAGT TTCACTTTTA TTTTTTTAAA GGGTGGAATA ATAAACTAAC 
4051 GTGTTGACTC TTTGTATTTT GTAATTCTTC ATACTTATGG ATGTCTTTTT 
4101 ACTTAACTAT AAGTAACAAA ATAGATCAAC GTTTTAGTTT TTTTATATTA 
4151 TACATGTAAA AAGACATTTT GCATATAAGC CTTTCACAAA AATCTTGACA 
4201 GTAAACAATA AGCAGTGGCT CACCCAAATT AGGCAGACTT ACTGCACTAG 
4251 ACTCCTACCA TCTGTGTGAT ACTCCATGAA GGGAGGGAGA AGGGGAGGGA 
4301 GAAGGGTAGG CAGCTGGTCT GATGGCTGTG ACACAAGATA ATCCCCTTAA 
4351 CCTCCCAAGA CGCTGTGTGT TTTTTCCTTT TTTATTCTCC CTGGTTTACT 
4401 TTCGTTTTGT TTGAGACAGG GTCTCTGTGT CACCCAGGCT GGAGTGCAGT 
4451 AGCAGGACAG CTCACTGCAG CCTTAGCCTG CTGGGCTCAA GCGATCCTCC 
4501 TGCCTTAGCC TCCTGAGTAG CTGGGAACAC AGGCATGTGC CACCACCACA 
4551 CCCAGCCAAT TAAAAAAATT TTTTTTTTAC TAGAGACATG GTCTTGCTAC 
4601 GTTGCCCAGT CTGGTCTCCA TCTCCAGGCT CAAGCAGTCC TCCCACCTCG 
4651 GCCTCCCAAA GTGCTGGGAT TACTCTCACT CTCTTAAAAC CAGGCAGGTA 
4701 GGGAGATTTA TCTCAGGCTT AAAGATTGCC ATTGTCTCAT CAAAGAGTGT 
4751 TTGGTGTGAA ACTTTGAAAT GAATATCAAG ATTGTGTTTT TATTTTTGAA 
4801 TAAGGTTTAT AGTTTTCATA GTTCTTATTT CATGGAAGAA GATTGAATGC 
4851 ATTTAAAATG TTATTTTATT GTTTGCATTT CTGTATGGCT CCTTTTGTGA 
4901 GATCTTTACT AGCAATGTTT TGGCTTTATA AGTGGTAGGT AAGAGTTTTA 
4951 ATTTACACTG TTAGAATCTG GAATTTTTGA AACGTTTTTC CTCTTTCACA 
5001 TGAATGGTTC CTATGTATTT AGGAAGTTAA AGTTTTACTT TTTTTTAATT 
5051 AATTTTTTTT TTTAGGCTGG AATGCAGTGG CACAGTCATA GCTCACTGTA 
5101 GCCTCAGGTG TGTGCCACCA TACCTGACTA ATTTTTTAAT ATTTATTTTT 
5151 GTAGAGATGA GAGTCTCATG TTGCCCAGGC TGGCTTTGAA CTCCTGGCTT 
5201 CAAGTGGTCC TCCCACCCTG GCCTCCCAAA GTGCTGGGGA TTATAGGTGT 
5251 GAGCCATCAT GCCCGGCCTA GTTTTTATTT TTTAAAATTT GAGTGGGTTG 
5301 TTCGTGGTCT CTGTCAGAGA GGAATCCCAT TTAACAGAGA ATCTTTTTAT 
5351 GGCTCTCCAG AGAAAATGAA TGGTAAACTT ATCTTTTCAA CAAGCTCTCA 
5401 CTCAGAAATG ATACACACAC ACTTCTGATA GGACTTTTAG CTTCTTTAAC 
5451 TTTGTTCCTT TCACTCATAT CAGTGGTTCT TATTTTTGAG ATACACAGTA 
5501 ATGAAGCCAT GGGAGAAAGT ATCTAAGTAG CTTTCTGGCA GTCCTAATCT 
5551 TTGCAGGCGC AAGATTACAG GCGCATGCCA CAGCACTGGG CCCCTTCTTG 
5601 CTCTTTATTG TATAGCATTA TCCTGCCTCA TTGTTTCAAC TCTAGGATTG 
5651 AGAAAGAAGT TACCTTTTCT CTGTTACTGT CGCCTGGCTG GTTTGGACTC 
5701 CTGCCTTCCA AAAACTGCAG TTTCTGTAGT TGTATTTGGA AATTTATTTC 
5751 ACAATACAAT AAATTTCTGG CCCCACAAAA TATTTATTAA CTGCCAAGAA 
5801 TAACACATCT GTTTGATTGC TAAATATAAC CATTGATTTG CTGTTTCACC 
5851 TTCTCTCAGC TTTACTTCTT CCCAAATTCC TAAATTTCCT TCACTTTTTC 
5901 TGAGATACAT TAGTGGACTG TCTCTGCCTG TAAGTTAACT GAAACACTGA 
5951 TTCCTAGTAT TTCAGTTGTT TTCCTCCAGC ACTGTCATTG TCTGTGTTTG 
6001 TTGGCTTTGT CCAATAATGG TCTATTGAGG GGTGAAGATA TACGTAATTA 
6051 GCTTTCTGCC TATTGGCTTG TACACTCCAG GGTATACTTG GCAGATCAGT 
6101 CTTAACTCTT CTCACCAAGA TCAGTCCAGT GCTGGATTAG GTAAGGTATG 
6151 AACACATCAG ATGTGCTTTT TATGGAGAAA TCATGTTGGT TTACACGTCA 
6201 GTGTGTGAGA ATGTGGCAGA AGGGAGCTAA AATAGTATGA TAATACTACT 
6251 GGATAAATTT TGTGGTCTAA CCTAAACCTT AGCCATTACA TAGAATACTT 
6301 TTGCTGTGAG CAGGTTTGCT CAGTTGTAAA ACTGGAAAGG AATCATTTCT 
6351 CACCCCCCGC CTCCAAGCTT TTTACCTCCA AACAGTGACA GCCACCCAAA 
6401 CATCAAGAGA ACAGTGTTTC AGAGAACATT TCTACTGGGG CTTCAGGAGG 
6451 AGCCTGTCCA AGATTTAGGC TGTTCAAATT ATAAATTATA AAACAGCTGG 
6501 CTCAAGCCCA TTGTGTTTAA GTCAGAGAGT GCTAAGTATC TTTTCTTTTG 
6551 TCTTGTCTCC CTAAAGTATT TATCTCATAC TTCAATCAAT TTAAAATATT 
6601 TTTTCTTACA GATCCAATTT GATAGAAGAG TCAAGTTTGC CTAGAGTGGA 
6651 GATTAAATCA TAGTTTTATT TGAAGTATAA TTTTGGCTTG CTCAAAATGA 
6701 ACAGTATCTG GTTATGACTA AGAATGGCAT GAAAAGGCCA GACGCAGTGG 
6751 CTCATGCCTG CAATCCCAGT ACTTTGGGAG GCCAAGGCAG GTGGATCACC 
6801 TGAGGTCAGG AGTTGGAGAC CAGCCTGGCC AACATGGTGA AACCCCATCT 
6851 CTACTAAAAA TATAAAAATT AGCCGGGCCG TGGTGGTGGG CACCTGTAAT 
6901 CCCAGCTACT CGGGAGACTG AGACAGGAGA AATCACTTGA ACCCGGGAAG 
6951 CGGAGGTTGC AGTGAGCCGA GATCGCACCA CTGCACTCCA GCCTGGGTGA 
7001 TAAAAGCAAA ACTCCGTCTC AAAACAAACA AACAAAAGAA TGGCATAAAC 
7051 AGACACAGCT CACAGATGAT CTAGTCTCTT TAGCCACTAA TTTCATTATA 
7101 TTCTCACTAT AATTTCTTTG AAAACAAAGG ATGGGTTTGT TTTTTGCCCC 
7151 TCTTTGCGCT GCTTGCCTTC AGATGCGGGA TAATCCTGTT TCATTGGCCA 
7201 AAGCATGGAT TCATTTTGGA GGCCAAGGAA GATGCAAACA CAGTGCACAG 
7251 GGTGGAAGAG AAGCCTATGA ATATGTTGGG GCTTATTAAA TTTCCATAAC 
7301 TTCATTCTGA TAACTGATTA TTATACTTTC CAAAATAGCT GACAATTAAA 
7351 AAGTACTGAT TTGTTTGTAT ATTTTTGTCT TTTAAGGCAA GCAGAGTGCT 
7401 GGTAGCATCC CGTAATTTTG CAAATGATGC TACATTTGAA ATTAAGGTAA 
7451 GAGTGTTTTA CTTTGTTAAT AATTTTTTCA CAGGTACACT CTGATATACA 
7501 GTTTTACCTT TAGAATAGAA CATCTTGATG TTCATGATTA GTCATCATTT 
7551 TCTTCTAAAT GTCCAGGATC AGAAGTTCAG AGAAGCTTAT TCAAAAGTTT 
7601 GGAATGTAAT TCAGTGAAAT ATTTGAATAA GAAGAGTCTT AGTTGTTTCT 
7651 TTGAAGGTTC TTTCAACCTA TAACTCAGTT GGCTTCTAGG GGCTTTCAGT 
7701 GAAAATCATC TTAGAAAGAT TTCCTTCCCC CAAGCCCCAT CTCATTGCAC 
7751 AGTGAGGTTT ATGGATTTAA GGAACAGAGG CGATATGAAG CATTACTGAT 
7801 GTGCTCCTTT GCAGTTTTTC AAGTTCAATA TTATTTGCAA TGGAGTTAGA 
7851 TCTTAGAGTG GTCAACAGTG TTTGCAATGT AGTATGTGGA GGATAATAAC 
7901 TACCTTATTC CATTTCAGAA ATGTGACCTT CACCGGCTGG AAGAAGGCCC 
7951 TCCTGTCACA ACAGTGCTCA CCAGGGAGGA TGGGCTCAAA TACTACAGGA 
8001 TGATGCAGAC TGTACGCCGA ATGGAGTTGA AAGCAGATCA GCTGTATAAA 
8051 CAGAAAATTA TTCGTGGTTT CTGTCACTTG TGTGATGGTC AGGTGAGTGG 
8101 TAGGTTTGTG GTGGAACTGT GTTATTTAGG TACTGAAGTA TGGCTTGTAC 
8151 TTATTGGGCT TTACCCTGCC ATATGTATCA GAAGAGTTTG AGGCTGGTAA 
8201 TGTAATTTTC TTTTATTTAT TTATTTTTTT GAGACAGTCT CTCTCTGTCG 
8251 CCCAGGTTAG AGTACAGTGG TGATCTTGGC TCACTGCAGC CTCTGGTTAG 
8301 AGTACAGTGT GATCTTGGCT CACTGCAGCC TCTGTCCACT GGGCTCAAGC 
8351 AATCCTCCCA CCTCAGCCTC CCGAGTATGT GGGACCACAG GTGCACACCA 
8401 ACACACCCAG CTAATTTTTG TATTTTTTGG AGATACGGGG TTTCACTATG 
8451 TTGCCCAGGC TAGTCTCAAA CTTCTGGGCT CAAGTGGTCC GCCCACCTTG 
8501 GCCTCCCAAG GTGCTAGGAT TACAGGCGTG AGCCACTGTG CCTGGCTGAA 
8551 GCCAGTATTT TAGAATTAAA AAGTAGAATG CCAAAACCTG CTATGAAGCT 
8601 TAGGCTAAAG AATTCATTCA CACATAACAT TGCCAGTTTT CTGTACCTGT 
8651 TCTTAGAGTT TTACTATTTT AAAACTTTCT GGCACTATGA TCGCCTGTAC 
8701 TGTATATAAT TTGGAGAGAA AGGATTAGTT TGTTTTTTGT TTTGTGGGCT 
8751 TAGGTCAAGG GTTAGAGTCA AATACCTACA AGGGCCAGCC AGGTAGAATA 
8801 AATGAGTGAA GAAGGCTAGG TATACAAAAC AGAAAATGGT GACAGGGACT 
8851 CATGCTGAAC TGGCACCAGC ATGCCCTACC CAGAGGAATG CCATGACTTG 
8901 GTTCCAGCCA GTTGGTGCCA TGTGGAAATC AGGGGTAATG TTTCCTGTTT 
8951 TCCATGTCTA AGAGAAGGCG GAAGTCTGGA TTTTCATGTG AAATTCCCAG 
9001 TGTTTTAATG TTGACATCTG ATGTAGGCTT TTATTTTAGG TCATCATACA 
9051 GGAGAAAGGA AGGAAGTGGC ACATGTGTGG GTTGCCAGTT TATTGCTTCT 
9101 GGTTTGGGCC TTCCACTCTG TATTTTGGGG GAAAATAGCT ACTTTCTCTG 
9151 GTTATTAATG ACAGGGTCTA CTAGCCCACA TATTTCACTG TGGTCTAGGA 
9201 AACGTTTTTA TTTAGAAACA TGTATCATAT TGCCTCATAG TTTCTCCTTC 
9251 CTCTAACACA GGAAGCTTGC TGTGTGGGCC TGGAGGCCGG CATCAACCCC 
9301 ACAGACCATC TCATCACAGC CTACCGGGCT CACGGCTTTA CTTTCACCCG 
9351 GGGCCTTTCC GTCCGAGAAA TTCTCGCAGA GCTTACAGGT TTGCTGTTGA 
9401 TTTACAGAAA GGGGAAATGA GTGGATTAAG TTTTTAAATA TCTGTGCATT 
AAGATGCTAT 
TAACCCTCTC 
ACTGACCATT 
GTTGGTGCTA 
AAGAAGTCTA 
ATGCAATGTA 
CCCGTGACTA 
GACTGTGTCT 
CTCACTGCAA 
TCCTGAGTAG 
TGTATTTTTA 
AACTCCTGAC 
GGGTTACAGG 
AGAGTCTCGC 
CCCAAAATGC 
TCCTCCTTTT 
GGCTGGAGTG 
CGCCCCCCGG 
GATTATAGGT 
GAGACCAGGT 
AGGTGATCCA 
GCCGCCGTGC 
ATATCCAGGG 
TACTCAAGTC 
AATATTCTAT 
TGTACCTGTG 
CTAGTATAGG 
AAAGTCCAGG 
AGAGAGTAGG 
ACAAGGAGAA 
ACTGGCCTCT 
CGGTTTGGTT 
TGCCATTTCC 
ATGCACATGT 
GCAGGTAGTC 
CTTTGTCTTG 
GTACAGCCAT 
TGTTGGACCC 
GATGTTGTAG 
TATTTTGCCT 
CTAGGAGCAG 
CATCTAGGTT 
GTGATGCAAT 
TTTGGGGGCT 
TTGAGAGATG 
GTCTGCTGTT 
TCCCCTTCCT 
TCATTTTGGG 
AGCACAAAAT 
TTCTATTATG 
AAATGGAGTC 
CTCACTGCAA 
CCCGAGTAGC 
GTATTTTTAG 
TCTGTGGACC 
ACCACGCCCG 
GTGCTTACTG 
TTATTGAAAA 
GAGTTTGTAG 
AAATGCAGGG 
TTCTCGGTTG 
GCCTGTAAGT 
TGGTGCTGCT 



TATGAGTTAA 
TCCTTTGGTG 
TGTGAAGTTC 
TAAAATCACA 
GCAGTCATAG 
AATTCTAGAA 
TTTGTTTGTT 
CACTCCGTTG 
CCTCCACCTC 
CTGGGACTAC 
GTAGAGATGG 
CTCAGGTGAT 
CGTGAGCCAC 
TTTGTTGCCC 
TAGGATTACA 

GCTGGAGTGA 
GTTCAAGCAA 
GCCCAACCAC 
TTCACCATGT 
CCCTCTTCGG 
CCGGCCCTCC 
GATTGGTTCT 
CCAAAGTCAG 
TTTCAATACA 
CAGTTCAAAC 
AGAGGTCCTA 
GTATCATCTG 
AAAAGCTGGA 
ACATGTGTTT 
GTGTTCTAAT 
TGCTTTGTAG 
AGGACGAAAA 
ATGCCAAGAA 
AAGGACGAGG 
AAAAACCTTT 
GTGCTGCACA 
ATAAGATTAT 
TGGCACAACA 
ACAGGATTCA 
TAGGCTCTAC 
CGTGCATTAC 
TCTGAGAATA 
TGGTTTGCTT 
AGTAGATTTG 
TTATAAGTAA 
ATTTATATTT 
GTCCTTGTTC 
TATATCCGGA 
TGTTGTTTTT 
TTGCTCAGCC 
CCTCCACCCC 
TGGGAATATA 
TAGAGACGGG 
TCGTGATCTG 
GCCAGGTTTT 
GTTAAAATAG 
TATCGGAGGT 
CAGCAGTGTA 
CATTAATTAG 
TCCTTAATGT 
ATAATGGAAA 
AACCAGGTAA 



TATTTGTTAA 
CTCTGGTACT 
TCTGGCCCCT 
GTAGGTTTGG 
AAAGTAAGTT 
ATCTTCTTAA 
TTGGTGGTTT 
TCCAGGTGGT 
CCGGGTTCAA 
AGGCATGCAC 
GGTTTCAACA 
CCACCTGCCT 
CGCACCTGGC 
AGGCTGGAGT 
GGCGTGAGCC 
TTTTGAGACA 
AGTGGTATGA 
TTCTCCTGCC 
CACACCTGGC 
TGGCCAGGCT 
CCTCCCAAAA 
TTGACTCTTG 
AGGACCCTCG 
CCTTCCATAT 
TGTGTGGCTG 
CCTGTTCAAG 
AGATGTTTGT 
GATGGAACAT 
AAGGTTGAAG 
GCCTGGAGGG 
GGTTGAGCCT 
AGTTGGTTTG 
GGAGGTTGTG 
CTTCTACGGG 
ATTGTGTGCT 
CACAGCCCCA 
GTGACGCTTT 
AATGGAGCTG 
CATTACCTTT 
GTAGAGTCAC 
TATACAGCCT 
AGTATGGTGT 
TATCCCTGTT 
TTAAAGACCT 
GATGGTGATT 
GCAGGAACCT 
CAGTACATTA 
TCATCAGTTA 
CTGTTTCTTT 
TTTTAAACCT 
GCCCAGGCTG 
CGGGTTCAAG 
GTTACGTGCC 
GTTTCACCAT 
CCCAAAGTGC 
ATTTTTTAAC 
AACATAGTAT 
GGGATAAACA 
ATTTCTGTGT 
TATCTCCCCT 
TAGGTGCCCC 
AGATGAGGTC 
TTATGTCTCT 



AAATTTTAAG 
TCTGTTGTGC 
CAGGTAAAAG 
TTATCATTCA 
CGGTTGAAGC 
TATTCCCCTT 

GTGCAGTGGT 
GTGATTCTCA 
CACCACACCT 
TGTTGGCCAG 
TGGCCTCCCA 
CTGTTTTGTT 
GCAGTGGCCT 
ACTGTGCCCG 
GAGTTTCACT 
TTTTGGCTCA 
TCAGCCTCCT 
TAATTTCTGT 
GGTCTTGAAC 
TGTTAGGATT 
AACTATGGTT 
AGTATACAAA 
CTTCGGGTTT 
AAAAAAAATC 
GATTGAATAT 
AACTGGCCAG 
CTGAAGGAAA 
CACATGGAAC 
ACAGGTACTT 
CAGAGTACAT 
TTCTGCACAT 
CTAAAGGGAA 
GGCAATGGCA 
GCTTTAGATT 
GACAACTTTT 
GGTCAATGTC 
AAAAATTCCT 
TCTACGTTTA 
ATGCTGTGCA 
AGGTGTGCAG 
TCACATGACA 
GTTAAGTGAC 
AGTGCTTCAT 
TATAATGTTT 
CTAGCAGTGG 
ATTGCTTTAT 
GTGAATGATG 
TCCTTTCTAA 
AGGTTTTATT 
GAGCAGTGGT 
CAATTCTCCT 
ACCATGCCCA 
CTTGTCCAGG 
TGGGATTACA 
TCTTGAATGC 
TTATATATTA 
GAGAGATAGG 
CAGATTCTGG 
CATGGATTTC 
TGGGCGCTGG 
TGCCTGACTT 
TAACTTCCCA 
TCTTGAGTTA 
TTTAAAACAG 
AGCATGCCAG 
ACTCCATGGT 
TTCTTTGTCC 
TTTTTTTTGA 
GTGATCAGGG 
TGCCTCCACC 
GGCTAATTTT 
GCTGGTCTCC 
AAGTGTGCTG 
TTTTTGAGAC 
GCCTCAGCCT 
GTCCTCCTCC 
CTTTCACCCA 
CTGCAGCCTC 
GAGTAGCTAG 
ATTTTTAGTA 
TCTTGACCTC 
ACAGGCGTGA 
GTCCCTCTAT 
AATCCTCAAA 
GCATCCTGAG 
TGTGTATAAG 
ATTTAGTGTA 
AAAACCCAGA 
CTAAGTGACT 
TAGTGAAAGG 
AGACGACTGA 
ATTTGGGGTG 
GTGTATGTTC 
AGGAGGATCG 
TCGTGGGAGC 
TGGCCCTGGA 
CCTGAAGCTA 
GCATATATGA 
GTCGCCTAGT 
GGTACACAAA 
GGGTTGTAGC 
TGGGCTGTAC 
AAATCGCCTA 
GCGTGACTAT 
ATCCTACCGT 
CCTTTTAGGT 
AGCCATACCT 
CTTGTCAACT 
AAGAATTAAC 
TATATTAAGA 
TTTCCTTTTG 
GTAATCTCAG 
GCCTCAGCCT 
ACCATTTTTT 
ATGGTCTCGA 
GGCGTGAGCC 
AGAAATGTTA 
CTTTAGTGCT 
GTTGGAAGGA 
CCAGGAGTGA 
TGTGGTTCCT 
GATTGCTCTA 
TATATGGCGA 
AAAACAGTCT 
TATTTTCAAA 
ATTGCTTATT 
GACTTATGGC 
CTTACGGGGG 
GCTTGGCAAA 
CAATACAGAC 
ATAATAGCGT 
GTAATCCCCA 
GATTCTTAGG 
CATATTGATC 
GAAACAAGTA 
GCCTGCTACT 
GCTCGGTTTT 
CCAGGCAGAG 
AAATGAAACC 
ACAACATGGC 
AATCGCTATG 
TTACTACAAG 
TGGTGGGGCC 
CCCTTGACGA 
GGTCATGTGA 
CTCCCCTGTT 
AGGCAACAAG 
GTCTTTAATA 
AGGTGAAATA 
CAGATTAGAG 
CACATTCCTG 
GGCCCTGTGG 
CCATTCTCTT 
CTGAGTCCTA 
GTCTGTGGCC 
TGCTGGTTCA 
TTTTTTCCAG 
GTAGTTGGTT 
TCTCCTCCAC 
AGAAGAGGAG 
CAGCAGCTGT 
CCTTTTCGTA 
AGCTTTGTGG 
GAATGGGAAC 
AGAGGCGATT 
GGGGCCAAGG 
TCTTAGAAAC 
AAGTAAAATG 
TATTACCAGG 
TTTACAGTTG 
GCAAGTCCTA 
ATTGACCTCT 
TGAAGGAGCT 
TAAAGGACCT 
GGATGTCCGG 
ACTCAGTTTC 
AGCACTCTGT 
CTCTGGCTAT 
TGTGTTCCTT 
TGTCACCTTC 
CACCCACCCC 
GCTTTCTGTG 
TAGAGATGAT 
ACTACTTCCA 
AAATTACCTT 
GTCTGTTGAG 
TCATTCCTGG 
CCAAGGCCAA 
ATTGGAGAGT 
GTTTGGGGCA 
TGGATGGAAT 
AATTTCTAAA 
TGGCTAGCTC 
TAGCGTTGTT 
CACCTTTGCT 
CCCCACAACC 
GCTGGCAGTG 
TATGCTTCTC 
GAAGCCCTGT 
CCAGTGGGCC 
ACTGCTAGCA 
CTTAGTTGCA 
GCTTTCCCTC 
CTTTATGAAA 
GAAGCCTGGA 
GGGCCAGATA 
GTATTTTCAT 
AGAGCGGCAG 
GCTGAGAGTA 
GGGTATGTAC 
GAAGTAGCAT 
AAATTTGGTT 
TCACAAGAGA 
CTACATCAGT 
TATTGCAAAA 
TCAAATTCGG 
TTGTTACCGA 
TCTAGAGGCT 
TGGCCCCAAA 
15751 TTCAGATGAT ATAGGCATAA GATACATTGG TTTTGCTGGC TGTGCTTCTT 
15801 TAGGGGGACT TAAGGGAGAA AGGCAAGGCA CATGGATTTC CTGCTTGGCG 
15851 CTCTGATGTC TCAAAGTCTA ATTATCACCA CACACACCAT CTCTGCTGTC 
15901 CCCACCCATG TAGTATACAG GAGCCCAAAT GGGTGGGACA AGTGACACTT 
15951 CTTTAGAACC TTACATCTAA ATCAAAGCAG CAAGCAAAAA CTTGGCCCCT 
16001 GTTGTCGGTA ATGCCAGGGA AGCCATGTGA CTCACCAGTG TACGGTTTTC 
16051 TAGAAAAGAC AGAAGCAGTT ATTACAGAAT GTTAGGCTGC GTTCTGGTAT 
16101 TTTGAAAGTA TAACAACAAC TCTGCCACGC CTATAGTGAC ATAAGCATTG 
16151 GTATGCCCCT TTGTTTCAGA AACACACTTC TGTATTTCAC CTCATTGGGA 
16201 CAATCCAACC CCATATCATG TTTCATCACG CCGTCCTTGC TCTACTGGAA 
16251 CTGCTCTTAC TGATCGATTA CTACTTTTCC CTCCCCATAG TTACCGTACA 
16301 CGAGAAGAAA TTCAGGAAGT AAGAAGTAAG AGTGACCCTA TTATGCTTCT 
16351 CAAGGACAGG ATGGTGAACA GCAATCTTGC CAGTGTGGAA GAACTAAAGG 
16401 TACAGTCACT TGTTCATGGT GGTTTGAAGG TTGGCTTTAA AAGTTGCCAC 
16451 CCCTGGGTGG CCACAGAGTT TGTGTGGGTT CCTCCAAGCC CAGAAAGTGA 
16501 TGTCCTGGGA CATAAATAGT TCCATAGTTC CAAAGTCCCT TGGGGTGGGG 
16551 GCTTTTCCTT TAGTTTCCTC TATTCAAAAT TGTATTACTC TTCAGATTTC 
16601 AGATTTTGGT GGACTGTGAA CCACCATCAC AGTGGCAAAG CCCCCACAGT 
16651 AGTATGGTTC TTTTTTCCTA AAAGTATACT GTGGATTTTT AATTCATAAA 
16701 ATAGATACAC CCTAGAAATC TGTNNNNNNN NNNNNNNNNN NNNNNNNNNN 
16751 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
16801 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
16851 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
16901 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
16951 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17001 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17051 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17101 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17151 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17201 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17251 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17301 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17351 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17401 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17451 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17501 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17551 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17601 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17651 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17701 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17751 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17801 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17851 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17901 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17951 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
18001 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
18051 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
18101 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
18151 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
18201 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
18251 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
18301 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

18351 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN (SEQ ID NO:3) 
Context:

DNA Position

1785 TCAAGTAGACAGGGCAGAACTGTCCAACTGCTACGTGATCTTTTAAAGACAAAGTTAGTG
GCAGACCATTTACAGAAACCAGATGTTCTGTCTTTTGGCTCTGAGCATGCTGCTAATCTT 
CATCATCTAGTGTACTGAACGAGATGTACTGAACGAGGGCTGCAGAGCTGCAGCACCGGC 
AGGAGTAGGCGCTCGGTAGGACGGGGCCTGCACAACCTCCCCGGTAGTCAGCAGAGCGGA 
ATCTAGGAAGGCTCCTTTCCCGCGGCGCCCTGGAGGCGGGGGCCCCACCTTCCCACGCAG 
[G, T] 

CGCTATCAAGCCCCGCCTCCTCACCCGCCCGCGGCGTGGCGTCGGAAAGAGCCCTCAGCC 
CCTCCCTCTCTGGCGCTGATACCCAATGGGCAGCCTCAGGCCTTTAGCGGGGGCGGGGCA 
CCCCCTGGACGCCGTTCTGGTTGGCCCGCGGCCCGGCGCAGCGCATGACGTTATTACGAC 
TCTGTCACGCCGCGGTGCGACTGAGGCGTGGCGTCTGCTGGGGCACCTGAAGGAGACTTG 
GGGGCACCCGCGTCGTGCCTCCTGGGTTGTGAGGAGTCGCCGCTGCCGCCACTGCCTGTG 



1895 TGCTAATCTTCATCATCTAGTGTACTGAACGAGATGTACTGAACGAGGGCTGCAGAGCTG 
CAGCACCGGCAGGAGTAGGCGCTCGGTAGGACGGGGCCTGCACAACCTCCCCGGTAGTCA 
GCAGAGCGGAATCTAGGAAGGCTCCTTTCCCGCGGCGCCCTGGAGGCGGGGGCCCCACCT 
TCCCACGCAGGCGCTATCAAGCCCCGCCTCCTCACCCGCCCGCGGCGTGGCGTCGGAAAG

AGCCCTCAGCCCCTCCCTCTCTGGCGCTGATACCCAATGGGCAGCCTCAGGCCTTTAGCG 
[G, A] 

GGGCGGGGCACCCCCTGGACGCCGTTCTGGTTGGCCCGCGGCCCGGCGCAGCGCATGACG 
TTATTACGACTCTGTCACGCCGCGGTGCGACTGAGGCGTGGCGTCTGCTGGGGCACCTGA 
AGGAGACTTGGGGGCACCCGCGTCGTGCCTCCTGGGTTGTGAGGAGTCGCCGCTGCCGCC 
ACTGCCTGTGCTTCATGAGGAAGATGCTCGCCGCCGTCTCCCGCGTGCTGTCTGGCGCTT 
CTCAGAAGCCGGTGAGACCTCCCGGGCGGGCCGGGATGGGGCGCGAGTGGGGCTGAGGCG 

2118 GGCGTGGCGTCGGAAAGAGCCCTCAGCCCCTCCCTCTCTGGCGCTGATACCCAATGGGCA 
GCCTCAGGCCTTTAGCGGGGGCGGGGCACCCCCTGGACGCCGTTCTGGTTGGCCCGCGGC 
CCGGCGCAGCGCATGACGTTATTACGACTCTGTCACGCCGCGGTGCGACTGAGGCGTGGC 
GTCTGCTGGGGCACCTGAAGGAGACTTGGGGGCACCCGCGTCGTGCCTCCTGGGTTGTGA 
GGAGTCGCCGCTGCCGCCACTGCCTGTGCTTCATGAGGAAGATGCTCGCCGCCGTCTCCC 
[G,C] 

CGTGCTGTCTGGCGCTTCTCAGAAGCCGGTGAGACCTCCCGGGCGGGCCGGGATGGGGCG 
CGAGTGGGGCTGAGGCGGGGCCGGAGGGCAGGGCGGGCCAGGCCGGGCCACCCAGAGCGG 
GGTGGAAGGCGCCAGGGGAGCCGGGGAGCCTTTA 

5144 TGAATGCATTTAAAATGTTATTTTATTGTTTGCATTTCTGTATGGCTCCTTTTGTGAGAT 
CTTTACTAGCAATGTTTTGGCTTTATAAGTGGTAGGTAAGAGTTTTAATTTACACTGTTA 
GAATCTGGAATTTTTGAAACGTTTTTCCTCTTTCACATGAATGGTTCCTATGTATTTAGG 

AGTCATAGCTCACTGTAGCCTCAGGTGTGTGCCACCATACCTGACTAATTTTTTAATATT 
[T,C] 

ATTTTTGTAGAGATGAGAGTCTCATGTTGCCCAGGCTGGCTTTGAACTCCTGGCTTCAAG
TGGTCCTCCCACCCTGGCCTCCCAAAGTGCTGGGGATTATAGGTGTGAGCCATCATGCCC
GGCCTAGTTTTTATTTTTTAAAATTTGAGTGGGTTGTTCGTGGTCTCTGTCAGAGAGGAA 
TCCCATTTAACAGAGAATCTTTTTATGGCTCTCCAGAGAAAATGAATGGTAAACTTATCT 
TTTCAACAAGCTCTCACTCAGAAATGATACACACACACTTCTGATAGGACTTTTAGCTTC 

7932 AAGAGTCTTAGTTGTTTCTTTGAAGGTTCTTTCAACCTATAACTCAGTTGGCTTCTAGGG 
GCTTTCAGTGAAAATCATCTTAGAAAGATTTCCTTCCCCCAAGCCCCATCTCATTGCACA 
GTGAGGTTTATGGATTTAAGGAACAGAGGCGATATGAAGCATTACTGATGTGCTCCTTTG 
CAGTTTTTCAAGTTCAATATTATTTGCAATGGAGTTAGATCTTAGAGTGGTCAACAGTGT 
TTGCAATGTAGTATGTGGAGGATAATAACTACCTTATTCCATTTCAGAAATGTGACCTTC 
[A,G] 

CCGGCTGGAAGAAGGCCCTCCTGTCACAACAGTGCTCACCAGGGAGGATGGGCTCAAATA 
CTACAGGATGATGCAGACTGTACGCCGAATGGAGTTGAAAGCAGATCAGCTGTATAAACA 
GAAAATTATTCGTGGTTTCTGTCACTTGTGTGATGGTCAGGTGAGTGGTAGGTTTGTGGT 
GGAACTGTGTTATTTAGGTACTGAAGTATGGCTTGTACTTATTGGGCTTTACCCTGCCAT 
ATGTATCAGAAGAGTTTGAGGCTGGTAATGTAATTTTCTTTTATTTATTTATTTTTTTGA 

8015 AAAGATTTCCTTCCCCCAAGCCCCATCTCATTGCACAGTGAGGTTTATGGATTTAAGGAA 
CAGAGGCGATATGAAGCATTACTGATGTGCTCCTTTGCAGTTTTTCAAGTTCAATATTAT 
TTGCAATGGAGTTAGATCTTAGAGTGGTCAACAGTGTTTGCAATGTAGTATGTGGAGGAT 
AATAACTACCTTATTCCATTTCAGAAATGTGACCTTCACCGGCTGGAAGAAGGCCCTCCT 
GTCACAACAGTGCTCACCAGGGAGGATGGGCTCAAATACTACAGGATGATGCAGACTGTA 
[C,T] 

GCCGAATGGAGTTGAAAGCAGATCAGCTGTATAAACAGAAAATTATTCGTGGTTTCTGTC 
ACTTGTGTGATGGTCAGGTGAGTGGTAGGTTTGTGGTGGAACTGTGTTATTTAGGTACTG 
AAGTATGGCTTGTACTTATTGGGCTTTACCCTGCCATATGTATCAGAAGAGTTTGAGGCT 
GGTAATGTAATTTTCTTTTATTTATTTATTTTTTTGAGACAGTCTCTCTCTGTCGCCCAG 
GTTAGAGTACAGTGGTGATCTTGGCTCACTGCAGCCTCTGGTTAGAGTACAGTGTGATCT 

8063 GGATTTAAGGAACAGAGGCGATATGAAGCATTACTGATGTGCTCCTTTGCAGTTTTTCAA 

GTTCAATATTATTTGCAATGGAGTTAGATCTTAGAGTGGTCAACAGTGTTTGCAATGTAG 

TATGTGGAGGATAATAACTACCTTATTCCATTTCAGAAATGTGACCTTCACCGGCTGGAA 

GAAGGCCCTCCTGTCACAACAGTGCTCACCAGGGAGGATGGGCTCAAATACTACAGGATG 

ATGCAGACTGTACGCCGAATGGAGTTGAAAGCAGATCAGCTGTATAAACAGAAAATTATT 
[C,A] 

GTGGTTTCTGTCACTTGTGTGATGGTCAGGTGAGTGGTAGGTTTGTGGTGGAACTGTGTT 
ATTTAGGTACTGAAGTATGGCTTGTACTTATTGGGCTTTACCCTGCCATATGTATCAGAA 
GAGTTTGAGGCTGGTAATGTAATTTTCTTTTATTTATTTATTTTTTTGAGACAGTCTCTC
TCTGTCGCCCAGGTTAGAGTACAGTGGTGATCTTGGCTCACTGCAGCCTCTGGTTAGAGT 
ACAGTGTGATCTTGGCTCACTGCAGCCTCTGTCCACTGGGCTCAAGCAATCCTCCCACCT 

8066 TTTAAGGAACAGAGGCGATATGAAGCATTACTGATGTGCTCCTTTGCAGTTTTTCAAGTT 

CAATATTATTTGCAATGGAGTTAGATCTTAGAGTGGTCAACAGTGTTTGCAATGTAGTAT 

GTGGAGGATAATAACTACCTTATTCCATTTCAGAAATGTGACCTTCACCGGCTGGAAGAA 

GGCCCTCCTGTCACAACAGTGCTCACCAGGGAGGATGGGCTCAAATACTACAGGATGATG 

CAGACTGTACGCCGAATGGAGTTGAAAGCAGATCAGCTGTATAAACAGAAAATTATTCGT 
[G,A] 

GTTTCTGTCACTTGTGTGATGGTCAGGTGAGTGGTAGGTTTGTGGTGGAACTGTGTTATT 
TAGGTACTGAAGTATGGCTTGTACTTATTGGGCTTTACCCTGCCATATGTATCAGAAGAG 
TTTGAGGCTGGTAATGTAATTTTCTTTTATTTATTTATTTTTTTGAGACAGTCTCTCTCT 
GTCGCCCAGGTTAGAGTACAGTGGTGATCTTGGCTCACTGCAGCCTCTGGTTAGAGTACA 
GTGTGATCTTGGCTCACTGCAGCCTCTGTCCACTGGGCTCAAGCAATCCTCCCACCTCAG 

9307 AATGTTGACATCTGATGTAGGCTTTTATTTTAGGTCATCATACAGGAGAAAGGAAGGAAG 

TGGCACATGTGTGGGTTGCCAGTTTATTGCTTCTGGTTTGGGCCTTCCACTCTGTATTTT 

GGGGGAAAATAGCTACTTTCTCTGGTTATTAATGACAGGGTCTACTAGCCCACATATTTC 

ACTGTGGTCTAGGAAACGTTTTTATTTAGAAACATGTATCATATTGCCTCATAGTTTCTC 

CTTCCTCTAACACAGGAAGCTTGCTGTGTGGGCCTGGAGGCCGGCATCAACCCCACAGAC 
[C,G] 

ATCTCATCACAGCCTACCGGGCTCACGGCTTTACTTTCACCCGGGGCCTTTCCGTCCGAG 
AAATTCTCGCAGAGCTTACAGGTTTGCTGTTGATTTACAGAAAGGGGAAATGAGTGGATT 
AAGTTTTTAAATATCTGTGCATTAAGATGCTATTATGAGTTAATATTTGTTAAAAATTTT 
AAGTTTCTTTTTTTAACCCTCTCTCCTTTGGTGCTCTGGTACTTCTGTTGTGCTCTTGAG 
TTAACTGACCATTTGTGAAGTTCTCTGGCCCCTCAGGTAAAAGTTTAAAACAGGTTGGTG 

9349 CAGGAGAAAGGAAGGAAGTGGCACATGTGTGGGTTGCCAGTTTATTGCTTCTGGTTTGGG 

CCTTCCACTCTGTATTTTGGGGGAAAATAGCTACTTTCTCTGGTTATTAATGACAGGGTC 

TACTAGCCCACATATTTCACTGTGGTCTAGGAAACGTTTTTATTTAGAAACATGTATCAT 

ATTGCCTCATAGTTTCTCCTTCCTCTAACACAGGAAGCTTGCTGTGTGGGCCTGGAGGCC 

GGCATCAACCCCACAGACCATCTCATCACAGCCTACCGGGCTCACGGCTTTACTTTCACC 
[C,T] 

GGGGCCTTTCCGTCCGAGAAATTCTCGCAGAGCTTACAGGTTTGCTGTTGATTTACAGAA 
AGGGGAAATGAGTGGATTAAGTTTTTAAATATCTGTGCATTAAGATGCTATTATGAGTTA 
ATATTTGTTAAAAATTTTAAGTTTCTTTTTTTAACCCTCTCTCCTTTGGTGCTCTGGTAC 
TTCTGTTGTGCTCTTGAGTTAACTGACCATTTGTGAAGTTCTCTGGCCCCTCAGGTAAAA 
GTTTAAAACAGGTTGGTGCTATAAAATCACAGTAGGTTTGGTTATCATTCAAGCATGCCA

9350 AGGAGAAAGGAAGGAAGTGGCACATGTGTGGGTTGCCAGTTTATTGCTTCTGGTTTGGGC 

CTTCCACTCTGTATTTTGGGGGAAAATAGCTACTTTCTCTGGTTATTAATGACAGGGTCT 

ACTAGCCCACATATTTCACTGTGGTCTAGGAAACGTTTTTATTTAGAAACATGTATCATA 

TTGCCTCATAGTTTCTCCTTCCTCTAACACAGGAAGCTTGCTGTGTGGGCCTGGAGGCCG 

GCATCAACCCCACAGACCATCTCATCACAGCCTACCGGGCTCACGGCTTTACTTTCACCC 
[G,A] 

GGGCCTTTCCGTCCGAGAAATTCTCGCAGAGCTTACAGGTTTGCTGTTGATTTACAGAAA 
GGGGAAATGAGTGGATTAAGTTTTTAAATATCTGTGCATTAAGATGCTATTATGAGTTAA 
TATTTGTTAAAAATTTTAAGTTTCTTTTTTTAACCCTCTCTCCTTTGGTGCTCTGGTACT 
TCTGTTGTGCTCTTGAGTTAACTGACCATTTGTGAAGTTCTCTGGCCCCTCAGGTAAAAG 
TTTAAAACAGGTTGGTGCTATAAAATCACAGTAGGTTTGGTTATCATTCAAGCATGCCAG 

11066 TCCTAAGATGTTTGTAACTGGCCAGAAAACCCAGAAAAGTCCAGGGTATCATCTGGATGG 

AACATCTGAAGGAAACTAAGTGACTAGAGAGTAGGAAAAGCTGGAAAGGTTGAAGCACAT 

GGAACTAGTGAAAGGACAAGGAGAAACATGTGTTTGCCTGGAGGGACAGGTACTTAGACG 

ACTGAACTGGCCTCTGTGTTCTAATGGTTGAGCCTCAGAGTACATATTTGGGGTGCGGTT 

TGGTTTGCTTTGTAGAGTTGGTTTGTTCTGCACATGTGTATGTTCTGCCATTTCCAGGAC 
[G,A] 

AAAAGGAGGTTGTGCTAAAGGGAAAGGAGGATCGATGCACATGTATGCCAAGAACTTCTA 
CGGGGGCAATGGCATCGTGGGAGCGCAGGTAGTCAAGGACGAGGATTGTGTGCTGCTTTA 
GATTTGGCCCTGGACTTTGTCTTGAAAAACCTTTCACAGCCCCAGACAACTTTTCCTGAA 
GCTAGTACAGCCATGTGCTGCACAGTGACGCTTTGGTCAATGTCGCATATATGATGTTGG 
ACCCATAAGATTATAATGGAGCTGAAAAATTCCTGTCGCCTAGTGATGTTGTAGTGGCAC 



FIGURE 31 



11128 CATCTGAAGGAAACTAAGTGACTAGAGAGTAGGAAAAGCTGGAAAGGTTGAAGCACATGG

AACTAGTGAAAGGACAAGGAGAAACATGTGTTTGCCTGGAGGGACAGGTACTTAGACGAC 
TGAACTGGCCTCTGTGTTCTAATGGTTGAGCCTCAGAGTACATATTTGGGGTGCGGTTTG 
GTTTGCTTTGTAGAGTTGGTTTGTTCTGCACATGTGTATGTTCTGCCATTTCCAGGACGA 
AAAGGAGGTTGTGCTAAAGGGAAAGGAGGATCGATGCACATGTATGCCAAGAACTTCTAC 
[G, A] 

GGGGCAATGGCATCGTGGGAGCGCAGGTAGTCAAGGACGAGGATTGTGTGCTGCTTTAGA 
TTTGGCCCTGGACTTTGTCTTGAAAAACCTTTCACAGCCCCAGACAACTTTTCCTGAAGC 
TAGTACAGCCATGTGCTGCACAGTGACGCTTTGGTCAATGTCGCATATATGATGTTGGAC
CCATAAGATTATAATGGAGCTGAAAAATTCCTGTCGCCTAGTGATGTTGTAGTGGCACAA 
CACATTACCTTTTCTACGTTTAGGTACACAAATATTTTGCCTACAGGATTCAGTAGAGTC 

11135 AGGAAACTAAGTGACTAGAGAGTAGGAAAAGCTGGAAAGGTTGAAGCACATGGAACTAGT 
GAAAGGACAAGGAGAAACATGTGTTTGCCTGGAGGGACAGGTACTTAGACGACTGAACTG 
GCCTCTGTGTTCTAATGGTTGAGCCTCAGAGTACATATTTGGGGTGCGGTTTGGTTTGCT 
TTGTAGAGTTGGTTTGTTCTGCACATGTGTATGTTCTGCCATTTCCAGGACGAAAAGGAG 
GTTGTGCTAAAGGGAAAGGAGGATCGATGCACATGTATGCCAAGAACTTCTACGGGGGCA 
[A,G]

TGGCATCGTGGGAGCGCAGGTAGTCAAGGACGAGGATTGTGTGCTGCTTTAGATTTGGCC 
CTGGACTTTGTCTTGAAAAACCTTTCACAGCCCCAGACAACTTTTCCTGAAGCTAGTACA 
GCCATGTGCTGCACAGTGACGCTTTGGTCAATGTCGCATATATGATGTTGGACCCATAAG 
ATTATAATGGAGCTGAAAAATTCCTGTCGCCTAGTGATGTTGTAGTGGCACAACACATTA 
CCTTTTCTACGTTTAGGTACACAAATATTTTGCCTACAGGATTCAGTAGAGTCACATGCT 

11143 AAGTGACTAGAGAGTAGGAAAAGCTGGAAAGGTTGAAGCACATGGAACTAGTGAAAGGAC 
AAGGAGAAACATGTGTTTGCCTGGAGGGACAGGTACTTAGACGACTGAACTGGCCTCTGT 
GTTCTAATGGTTGAGCCTCAGAGTACATATTTGGGGTGCGGTTTGGTTTGCTTTGTAGAG 
TTGGTTTGTTCTGCACATGTGTATGTTCTGCCATTTCCAGGACGAAAAGGAGGTTGTGCT 
AAAGGGAAAGGAGGATCGATGCACATGTATGCCAAGAACTTCTACGGGGGCAATGGCATC 
[G,A] 

TGGGAGCGCAGGTAGTCAAGGACGAGGATTGTGTGCTGCTTTAGATTTGGCCCTGGACTT 
TGTCTTGAAAAACCTTTCACAGCCCCAGACAACTTTTCCTGAAGCTAGTACAGCCATGTG 
CTGCACAGTGACGCTTTGGTCAATGTCGCATATATGATGTTGGACCCATAAGATTATAAT 
GGAGCTGAAAAATTCCTGTCGCCTAGTGATGTTGTAGTGGCACAACACATTACCTTTTCT 
ACGTTTAGGTACACAAATATTTTGCCTACAGGATTCAGTAGAGTCACATGCTGTGCAGGG 

12486 TTACAGGCGTGAGCCACCACGCCCGGCCAGGTTTTATTTTTTAACTCTTGAATGCAGAAA 
TGTTAGTGCTTACTGGTTAAAATAGAACATAGTATTTATATATTACTTTAGTGCTTTATT 
GAAAATATCGGAGGTGGGATAAACAGAGAGATAGGGTTGGAAGGAGAGTTTGTAGCAGCA 
GTGTAATTTCTGTGTCAGATTCTGGCCAGGAGTGAAAATGCAGGGCATTAATTAGTATCT 
CCCCTCATGGATTTCTGTGGTTCCTTTCTCGGTTGTCCTTAATGTTAGGTGCCCCTGGGC 
[G,C] 

CTGGGATTGCTCTAGCCTGTAAGTATAATGGAAAAGATGAGGTCTGCCTGACTTTATATG 
GCGATGGTGCTGCTAACCAGGTAATTATGTCTCTTAACTTCCCAAAAACAGTCTTATTTT 
CAAAGTCTTTAATATTTACAGTTGAATTTCTAAAGAAGTAGCATATTGCTTATTAGGTGA
AATAGCAAGTCCTATGGCTAGCTCAAATTTGGTTGACTTATGGCCAGATTAGAGATTGAC 
CTCTTAGCGTTGTTTCACAAGAGACTTACGGGGGCACATTCCTGTGAAGGAGCTCACCTT 

12558 CTGGTTAAAATAGAACATAGTATTTATATATTACTTTAGTGCTTTATTGAAAATATCGGA 
GGTGGGATAAACAGAGAGATAGGGTTGGAAGGAGAGTTTGTAGCAGCAGTGTAATTTCTG 
TGTCAGATTCTGGCCAGGAGTGAAAATGCAGGGCATTAATTAGTATCTCCCCTCATGGAT
TTCTGTGGTTCCTTTCTCGGTTGTCCTTAATGTTAGGTGCCCCTGGGCGCTGGGATTGCT 
CTAGCCTGTAAGTATAATGGAAAAGATGAGGTCTGCCTGACTTTATATGGCGATGGTGCT 
[G, A] 

CTAACCAGGTAATTATGTCTCTTAACTTCCCAAAAACAGTCTTATTTTCAAAGTCTTTAA 
TATTTACAGTTGAATTTCTAAAGAAGTAGCATATTGCTTATTAGGTGAAATAGCAAGTCC 
TATGGCTAGCTCAAATTTGGTTGACTTATGGCCAGATTAGAGATTGACCTCTTAGCGTTG 
TTTCACAAGAGACTTACGGGGGCACATTCCTGTGAAGGAGCTCACCTTTGCTCTACATCA 
GTGCTTGGCAAAGGCCCTGTGGTAAAGGACCTCCCCACAACCTATTGCAAAACAATACAG

13376 TCCTTACTGCTAGCATGGCCCCAAAGAAACAAGTAGTAGTTGGTTTGTCACCTTCCTTAG 
TTGCAAGAGTATGATGCCTGCTACTTCTCCTCCACCACCCACCCCGCTTTCCCTCACCAC 
CCAAAGCTCGGTTTTAGAAGAGGAGGCTTTCTGTGCTTTATGAAAGCTTTCTGTGCCAGG
CAGAGCAGCAGCTGTTAGAGATGATGAAGCCTGGAGAAAGAAGCCAAATGAAACCCCTTT 
TCGTAACTACTTCCAGGGCCAGATATTCGAAGCTTACAACATGGCAGCTTTGTGGAAATT 
[T,C,A] 

CCTTGTATTTTCATCTGTGAGAATAATCGCTATGGAATGGGAACGTCTGTT 

13378 CTTACTGCTAGCATGGCCCCAAAGAAACAAGTAGTAGTTGGTTTGTCACCTTCCTTAGTT 
GCAAGAGTATGATGCCTGCTACTTCTCCTCCACCACCCACCCCGCTTTCCCTCACCACCC 
AAAGCTCGGTTTTAGAAGAGGAGGCTTTCTGTGCTTTATGAAAGCTTTCTGTGCCAGGCA 
GAGCAGCAGCTGTTAGAGATGATGAAGCCTGGAGAAAGAAGCCAAATGAAACCCCTTTTC 
GTAACTACTTCCAGGGCCAGATATTCGAAGCTTACAACATGGCAGCTTTGTGGAAATTAC 
[C,T] 

TTGTATTTTCATCTGTGAGAATAATCGCTATGGAATGGGAA 

16233 GTGGGACAAGTGACACTTCTTTAGAACCTTACATCTAAATCAAAGCAGCAAGCAAAAACT 
TGGCCCCTGTTGTCGGTAATGCCAGGGAAGCCATGTGACTCACCAGTGTACGGTTTTCTA 
GAAAAGACAGAAGCAGTTATTACAGAATGTTAGGCTGCGTTCTGGTATTTTGAAAGTATA 
ACAACAACTCTGCCACGCCTATAGTGACATAAGCATTGGTATGCCCCTTTGTTTCAGAAA 
CACACTTCTGTATTTCACCTCATTGGGACAATCCAACCCCATATCATGTTTCATCACGCC 
[G,C] 

TCCTTGCTCTACTGGAACTGCTCTTACTGATCGATTACTACTTTTCCCTCCCCATAGTTA 
CCGTACACGAGAAGAAATTCAGGAAGTAAGAAGTAAGAGTGACCCTATTATGCTTCTCAA 
GGACAGGATGGTGAACAGCAATCTTGCCAGTGTGGAAGAACTAAAGGTACAGTCACTTGT 
TCATGGTGGTTTGAAGGTTGGCTTTAAAAGTTGCCACCCCTGGGTGGCCACAGAGTTTGT 
GTGGGTTCCTCCAAGCCCAGAAAGTGATGTCCTGGGACATAAATAGTTCCATAGTTCCAA 

16354 AAAAGACAGAAGCAGTTATTACAGAATGTTAGGCTGCGTTCTGGTATTTTGAAAGTATAA 
CAACAACTCTGCCACGCCTATAGTGACATAAGCATTGGTATGCCCCTTTGTTTCAGAAAC 
ACACTTCTGTATTTCACCTCATTGGGACAATCCAACCCCATATCATGTTTCATCACGCCG 
TCCTTGCTCTACTGGAACTGCTCTTACTGATCGATTACTACTTTTCCCTCCCCATAGTTA 
CCGTACACGAGAAGAAATTCAGGAAGTAAGAAGTAAGAGTGACCCTATTATGCTTCTCAA 
[G, A] 

GACAGGATGGTGAACAGCAATCTTGCCAGTGTGGAAGAACTAAAGGTACAGTCACTTGTT 
CATGGTGGTTTGAAGGTTGGCTTTAAAAGTTGCCACCCCTGGGTGGCCACAGAGTTTGTG 
TGGGTTCCTCCAAGCCCAGAAAGTGATGTCCTGGGACATAAATAGTTCCATAGTTCCAAA 
GTCCCTTGGGGTGGGGGCTTTTCCTTTAGTTTCCTCTATTCAAAATTGTATTACTCTTCA 
GATTTCAGATTTTGGTGGACTGTGAACCACCATCACAGTGGCAAAGCCCCCACAGTAGTA 

16377 GAATGTTAGGCTGCGTTCTGGTATTTTGAAAGTATAACAACAACTCTGCCACGCCTATAG 
TGACATAAGCATTGGTATGCCCCTTTGTTTCAGAAACACACTTCTGTATTTCACCTCATT 
GGGACAATCCAACCCCATATCATGTTTCATCACGCCGTCCTTGCTCTACTGGAACTGCTC 
TTACTGATCGATTACTACTTTTCCCTCCCCATAGTTACCGTACACGAGAAGAAATTCAGG 
AAGTAAGAAGTAAGAGTGACCCTATTATGCTTCTCAAGGACAGGATGGTGAACAGCAATC 
[T,G] 

TGCCAGTGTGGAAGAACTAAAGGTACAGTCACTTGTTCATGGTGGTTTGAAGGTTGGCTT 
TAAAAGTTGCCACCCCTGGGTGGCCACAGAGTTTGTGTGGGTTCCTCCAAGCCCAGAAAG 
TGATGTCCTGGGACATAAATAGTTCCATAGTTCCAAAGTCCCTTGGGGTGGGGGCTTTTC 
CTTTAGTTTCCTCTATTCAAAATTGTATTACTCTTCAGATTTCAGATTTTGGTGGACTGT 
GAACCACCATCACAGTGGCAAAGCCCCCACAGTAGTATGGTTCTTTTTTCCTAAAAGTAT 